$\epsilon$-Greedy Exploration


On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration

Neural Information Processing Systems

This paper provides a theoretical understanding of the deep Q-network (DQN) with $\epsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous empirical achievements of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in the existing analysis. Second, in contrast to conventional Q-learning algorithms, the DQN employs a target network and experience replay to obtain an unbiased estimate of the mean-square Bellman error (MSBE) used in training the Q-network. However, the existing theoretical analysis of DQNs either lacks a convergence analysis or bypasses the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with the $\epsilon$-greedy policy. We prove that an iterative procedure with decaying $\epsilon$ converges geometrically to the optimal Q-value function. Moreover, a higher $\epsilon$ enlarges the region of convergence but slows down convergence, while the opposite holds for a lower $\epsilon$. Experiments justify our established theoretical insights on DQNs.
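For concreteness, below is a minimal NumPy sketch (not the paper's implementation) of the two mechanisms this abstract highlights: $\epsilon$-greedy action selection with a geometrically decaying schedule, and the MSBE computed against a frozen target network on transitions sampled from a replay buffer. The names q_net, target_net, and the schedule constants are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_values, epsilon):
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def msbe(q_net, target_net, batch, gamma=0.99):
    """Mean-square Bellman error against a frozen target network.

    `batch` holds (states, actions, rewards, next_states) sampled from a
    replay buffer; sampling from the buffer is what makes this an
    (approximately) unbiased estimate of the MSBE.
    """
    s, a, r, s_next = batch
    targets = r + gamma * target_net(s_next).max(axis=1)   # Bellman targets
    predictions = q_net(s)[np.arange(len(a)), a]           # Q(s, a) for taken actions
    return np.mean((predictions - targets) ** 2)

# Geometrically decaying schedule eps_t = max(eps_min, eps_0 * decay**t).
eps_0, decay, eps_min = 1.0, 0.95, 0.05
q_values = np.array([0.1, 0.5, 0.2])
for t in range(100):
    eps_t = max(eps_min, eps_0 * decay ** t)
    action = epsilon_greedy(q_values, eps_t)
```

The schedule mirrors the abstract's qualitative finding: a large $\epsilon$ early widens exploration (a bigger region of convergence), while the geometric decay speeds up convergence later on.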


Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration

Neural Information Processing Systems

This paper provides a theoretical study of deep neural function approximation in reinforcement learning (RL) with $\epsilon$-greedy exploration under the online setting. This problem setting is motivated by the successful deep Q-network (DQN) framework, which falls in this regime. In this work, we provide an initial attempt at theoretically understanding deep RL from the perspective of function classes and neural network architectures (e.g., width and depth) beyond the ``linear'' regime. To be specific, we focus on the value-based algorithm with $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed with Besov (and Barron) function spaces, respectively, which aims at approximating an $\alpha$-smooth Q-function in a $d$-dimensional feature space. Moreover, for a two-layer neural network endowed with the Barron space, scaling the width as $\Omega(\sqrt{T})$ is sufficient.
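For reference, the $\epsilon$-greedy exploration strategy that both papers analyze is conventionally defined over a finite action set $\mathcal{A}$ as

```latex
\[
\pi_\epsilon(a \mid s) =
\begin{cases}
1 - \epsilon + \epsilon / |\mathcal{A}|, & a = \arg\max_{a' \in \mathcal{A}} Q(s, a'), \\
\epsilon / |\mathcal{A}|, & \text{otherwise},
\end{cases}
\]
```

so the policy is greedy with probability $1 - \epsilon$ and uniform over $\mathcal{A}$ otherwise; taking $\epsilon \to 0$ recovers the purely greedy policy.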

